Smart Paper Notes

Toward an understanding of the properties of neural network approaches for supernovae light curve approximation

Mariia Demianenko , Konstantin Malanchev , Ekaterina Samorodova , Mikhail Sysak , Aleksandr Shiriaev , Denis Derkach , Mikhail Hushchyn

Category: Machine Learning

2022-09-15

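Modern time-domain photometric surveys collect many observations of various astronomical objects, and the coming era of large-scale surveys will provide even more information. Most objects never receive spectroscopic follow-up, which is especially critical for transients such as supernovae. In such cases, observed light curves can offer an affordable alternative. Time series are actively used for photometric classification and characterization, such as peak and luminosity decline estimation. However, the collected time series are multidimensional, irregularly sampled, contain outliers, and lack well-defined systematic uncertainties. Machine learning methods help extract useful information from the available data in the most efficient way. We consider several light curve approximation methods based on neural networks: multilayer perceptrons, Bayesian neural networks, and normalizing flows, each used to approximate the observations of a single light curve. Tests on the simulated PLAsTiCC and real Zwicky Transient Facility data samples show that even a few observations are enough to fit the networks and achieve better approximation quality than other state-of-the-art methods. We show that the methods described in this work have better computational complexity and run faster than Gaussian processes. We analyze the performance of approximation techniques aimed at filling gaps in light curve observations, and show that using an appropriate technique increases the accuracy of peak finding and supernova classification. In addition, the study results are organized in the Fulu Python library, available on GitHub, which can be easily used by the community.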

Translated by Google Translate

Latent Neural Stochastic Differential Equations for Change Point Detection

Artem Ryzhikov , Mikhail Hushchyn , Denis Derkach

Category: Machine Learning

2022-08-22

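The purpose of change point detection algorithms is to locate abrupt changes in the time evolution of a process. In this paper, we introduce an application of latent neural stochastic differential equations to the change point detection problem. We demonstrate the detection capabilities and performance of our model on a range of synthetic and real-world datasets and benchmarks. Most of the studied scenarios show that the proposed algorithm outperforms state-of-the-art algorithms. We also discuss the strengths and limitations of this approach and indicate directions for further improvement.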


Supernova Light Curves Approximation based on Neural Network Models

Mariia Demianenko , Ekaterina Samorodova , Mikhail Sysak , Aleksandr Shiriaev , Konstantin Malanchev , Denis Derkach , Mikhail Hushchyn

Category: Machine Learning

2022-06-27

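Photometric data-driven classification of supernovae has become a challenge with the advent of real-time processing of big data in astronomy. Recent studies have demonstrated the superior quality of solutions based on various machine learning models. These models learn to classify supernova types using their light curves as inputs. Preprocessing these curves is a crucial step that significantly affects the final quality. In this talk, we study the application of a multilayer perceptron (MLP), a Bayesian neural network (BNN), and normalizing flows (NF) to approximate the observations of a single light curve. We use these approximations as inputs for supernova classification models and demonstrate that the proposed methods outperform the state-of-the-art approach based on Gaussian processes when applied to the Zwicky Transient Facility Bright Transient Survey light curves. The MLP shows quality similar to that of Gaussian processes together with a speed increase. Normalizing flows also exceed Gaussian processes in terms of approximation quality.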


Feature learning in neural networks and kernel machines that recursively learn features

Adityanarayanan Radhakrishnan , Daniel Beaglehole , Parthe Pandit , Mikhail Belkin

Category: Machine Learning | Artificial Intelligence

2022-12-28

Neural networks have achieved impressive results on many technological and scientific tasks. Yet, their empirical successes have outpaced our fundamental understanding of their structure and function. By identifying mechanisms driving the successes of neural networks, we can provide principled approaches for improving neural network performance and develop simple and effective alternatives. In this work, we isolate the key mechanism driving feature learning in fully connected neural networks by connecting neural feature learning to the average gradient outer product. We subsequently leverage this mechanism to design Recursive Feature Machines (RFMs), which are kernel machines that learn features. We show that RFMs (1) accurately capture features learned by deep fully connected neural networks, (2) close the gap between kernel machines and fully connected networks, and (3) surpass a broad spectrum of models including neural networks on tabular data. Furthermore, we demonstrate that RFMs shed light on recently observed deep learning phenomena such as grokking, lottery tickets, simplicity biases, and spurious features. We provide a Python implementation to make our method broadly accessible [GitHub: https://github.com/aradha/recursive_feature_machines].
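
The average gradient outer product named in the abstract can be illustrated with a small sketch. For a scalar function f and sample points x_1, ..., x_n, it is the matrix (1/n) * sum_i grad f(x_i) grad f(x_i)^T. The code below is only an illustration of that quantity, not the authors' RFM implementation; the helper names `numerical_gradient` and `agop`, the finite-difference step, and the toy function are all assumptions made for this example.

```python
def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at point x."""
    grad = []
    for j in range(len(x)):
        up = list(x)
        down = list(x)
        up[j] += eps
        down[j] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def agop(f, points):
    """Average over the points of the outer product grad f(x) grad f(x)^T."""
    d = len(points[0])
    matrix = [[0.0] * d for _ in range(d)]
    for x in points:
        g = numerical_gradient(f, x)
        for a in range(d):
            for b in range(d):
                matrix[a][b] += g[a] * g[b] / len(points)
    return matrix


# Toy function (hypothetical): depends only on the first coordinate.
def toy(x):
    return x[0] ** 2


m = agop(toy, [[1.0, 5.0], [2.0, -3.0]])
```

In this toy case only the entry for the first coordinate is nonzero, which is the sense in which the matrix highlights the input directions the function actually uses.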


Less is More: Parameter-Free Text Classification with Gzip

Zhiying Jiang , Matthew Y. R. Yang , Mikhail Tsirlin , Raphael Tang , Jimmy Lin

Category: Natural Language Processing

2022-12-19

Deep neural networks (DNNs) are often used for text classification tasks as they usually achieve high levels of accuracy. However, DNNs can be computationally intensive, with billions of parameters and large amounts of labeled data, which can make them expensive to use, to optimize, and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that is easy, lightweight, and universal in text classification: a combination of a simple compressor like gzip with a k-nearest-neighbor classifier. Without any training, pre-training, or fine-tuning, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets. It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also performs particularly well in few-shot settings where labeled data are too scarce for DNNs to achieve satisfactory accuracy.
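
A minimal sketch of the compressor-plus-nearest-neighbor idea follows. The abstract does not specify the distance, so this example assumes the normalized compression distance (NCD) as the similarity measure; the function names, the joining space between texts, and the toy training data are illustrative assumptions rather than the authors' code.

```python
import gzip
from collections import Counter


def compressed_len(text):
    """Length in bytes of the gzip-compressed UTF-8 encoding of text."""
    return len(gzip.compress(text.encode("utf-8")))


def ncd(a, b):
    """Normalized compression distance (assumed metric for this sketch)."""
    ca, cb = compressed_len(a), compressed_len(b)
    cab = compressed_len(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)


def knn_predict(query, train, k=3):
    """Majority label among the k training texts closest to query under NCD."""
    nearest = sorted(train, key=lambda item: ncd(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-example training set.
sports = ("The home team scored twice in the final minutes and won the "
          "championship match in front of a roaring stadium crowd.")
tech = ("The new processor doubles memory bandwidth and ships with an "
        "updated compiler toolchain for embedded developers.")
train = [(sports, "sports"), (tech, "tech")]
```

No parameters are fitted here: classifying a query only requires compressing it alongside each training text, which is why the cost grows with the size of the training set.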


On Noisy Evaluation in Federated Hyperparameter Tuning

Kevin Kuo , Pratiksha Thaker , Mikhail Khodak , John Nguyen , Daniel Jiang , Ameet Talwalkar , Virginia Smith

Category: Machine Learning

2022-12-17

Hyperparameter tuning is critical to the success of federated learning applications. Unfortunately, appropriately selecting hyperparameters is challenging in federated networks. Issues of scale, privacy, and heterogeneity introduce noise in the tuning process and make it difficult to evaluate the performance of various hyperparameters. In this work, we perform the first systematic study on the effect of noisy evaluation in federated hyperparameter tuning. We first identify and rigorously explore key sources of noise, including client subsampling, data and systems heterogeneity, and data privacy. Surprisingly, our results indicate that even small amounts of noise can significantly impact tuning methods, reducing the performance of state-of-the-art approaches to that of naive baselines. To address noisy evaluation in such scenarios, we propose a simple and effective approach that leverages public proxy data to boost the evaluation signal. Our work establishes general challenges, baselines, and best practices for future work in federated hyperparameter tuning.
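
To make one of the named noise sources concrete, the toy simulation below shows how client subsampling combined with heterogeneous per-client errors makes an evaluation noisy. It is a sketch under stated assumptions, not the paper's experimental setup: the per-client error values, the function names, and the number of trials are all hypothetical.

```python
import random
import statistics


def subsampled_eval(client_errors, n_sampled, rng):
    """Mean error over a random subset of clients (one noisy evaluation)."""
    return statistics.fmean(rng.sample(client_errors, n_sampled))


def eval_spread(client_errors, n_sampled, trials=200, seed=0):
    """Standard deviation of the subsampled evaluation across repeated trials."""
    rng = random.Random(seed)
    estimates = [subsampled_eval(client_errors, n_sampled, rng)
                 for _ in range(trials)]
    return statistics.pstdev(estimates)


# Hypothetical heterogeneous clients: errors range from 0.0 to 0.95.
errors = [0.05 * i for i in range(20)]
spread_small = eval_spread(errors, n_sampled=2)   # few clients per evaluation
spread_full = eval_spread(errors, n_sampled=20)   # every client evaluated
```

When every client is evaluated the estimate does not vary between trials, while evaluating only two clients per round gives estimates that fluctuate widely, which can change the ranking of two hyperparameter configurations from one round to the next.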


Solving Sample-Level Out-of-Distribution Detection on 3D Medical Images

Daria Frolova , Anton Vasiliuk , Mikhail Belyaev , Boris Shirokikh

Category: Computer Vision

2022-12-13

Deep Learning (DL) models tend to perform poorly when the data comes from a distribution different from the training one. In critical applications such as medical imaging, out-of-distribution (OOD) detection helps to identify such data samples, increasing the model's reliability. Recent works have developed DL-based OOD detection that achieves promising results on 2D medical images. However, scaling most of these approaches to 3D images is computationally intractable. Furthermore, the current 3D solutions struggle to achieve acceptable results in detecting even synthetic OOD samples. Such limited performance might indicate that DL often embeds large volumetric images inefficiently. We argue that using the intensity histogram of the original CT or MRI scan as an embedding is descriptive enough to run OOD detection. Therefore, we propose a histogram-based method that requires no DL and achieves almost perfect results in this domain. Our proposal is supported in two ways. First, we evaluate the performance on publicly available datasets, where our method scores 1.0 AUROC in most setups. Second, we place second in the Medical Out-of-Distribution challenge without fine-tuning or exploiting task-specific knowledge. Carefully discussing the limitations, we conclude that our method solves sample-level OOD detection on 3D medical images in the current setting.
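
The histogram-as-embedding idea can be sketched in a few lines. The abstract does not describe how histograms are compared, so this example assumes an OOD score equal to the L1 distance to the nearest training histogram; the bin count, the intensity range, the function names, and the tiny voxel lists are illustrative assumptions, not the authors' method.

```python
def intensity_histogram(voxels, bins=8, lo=0.0, hi=1.0):
    """Normalized intensity histogram of a non-empty flat list of voxel values."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in voxels:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp values outside [lo, hi)
        counts[idx] += 1
    return [c / len(voxels) for c in counts]


def ood_score(hist, reference_hists):
    """L1 distance to the closest reference histogram (assumed scoring rule)."""
    return min(sum(abs(a - b) for a, b in zip(hist, ref))
               for ref in reference_hists)


# Hypothetical flattened scans with intensities scaled to [0, 1].
reference = [intensity_histogram([0.10, 0.15, 0.20, 0.12])]
score_in = ood_score(intensity_histogram([0.11, 0.16, 0.19, 0.05]), reference)
score_out = ood_score(intensity_histogram([0.90, 0.95, 0.85, 0.99]), reference)
```

A scan whose intensities fall in the same bins as the training data gets a score of zero, while a scan with a shifted intensity distribution gets a large score; a threshold on this score would then flag OOD samples.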


Hardware-efficient learning of quantum many-body states

Katherine Van Kirk , Jordan Cotler , Hsin-Yuan Huang , Mikhail D. Lukin

Category: Machine Learning

2022-12-12

Efficient characterization of highly entangled multi-particle systems is an outstanding challenge in quantum science. Recent developments have shown that a modest number of randomized measurements suffices to learn many properties of a quantum many-body system. However, implementing such measurements requires complete control over individual particles, which is unavailable in many experimental platforms. In this work, we present rigorous and efficient algorithms for learning quantum many-body states in systems with any degree of control over individual particles, including when every particle is subject to the same global field and no additional ancilla particles are available. We numerically demonstrate the effectiveness of our algorithms for estimating energy densities in a U(1) lattice gauge theory and classifying topological order using very limited measurement capabilities.


A Neural Network Approach for Selecting Track-like Events in Fluorescence Telescope Data

Mikhail Zotov , Denis Sokolinskii

Category: Machine Learning

2022-12-07

In 2016-2017, TUS, the world's first experiment for testing the possibility of registering ultra-high energy cosmic rays (UHECRs) by their fluorescent radiation in the night atmosphere of Earth, was carried out. Since 2019, the Russian-Italian fluorescence telescope (FT) Mini-EUSO ("UV Atmosphere") has been operating on the ISS. The stratospheric experiment EUSO-SPB2, which will employ an FT for registering UHECRs, is planned for 2023. We show how a simple convolutional neural network can be effectively used to find track-like events in the variety of data obtained with such instruments.


Weisfeiler and Leman Go Relational

Pablo Barcelo , Mikhail Galkin , Christopher Morris , Miguel Romero Orth

Category: Machine Learning | Neural and Evolutionary Computing | Machine Learning (Statistics)

2022-11-30

Knowledge graphs, modeling multi-relational data, improve numerous applications such as question answering or graph logical reasoning. Many graph neural networks for such data emerged recently, often outperforming shallow architectures. However, the design of such multi-relational graph neural networks is ad hoc, driven mainly by intuition and empirical insights. Up to now, their expressivity, their relation to each other, and their (practical) learning performance are poorly understood. Here, we initiate the study of deriving a more principled understanding of multi-relational graph neural networks. Namely, we investigate the limitations in the expressive power of the well-known Relational GCN and Compositional GCN architectures and shed some light on their practical learning performance. By aligning both architectures with a suitable version of the Weisfeiler-Leman test, we establish under which conditions both models have the same expressive power in distinguishing non-isomorphic (multi-relational) graphs or vertices with different structural roles. Further, by leveraging recent progress in designing expressive graph neural networks, we introduce the k-RN architecture that provably overcomes the expressiveness limitations of the above two architectures. Empirically, we confirm our theoretical findings in a vertex classification setting over small and large multi-relational graphs.
